Classification of MEDLINE Abstracts
Abstract
This paper presents preliminary results from our experiments on automatically assigning MeSH terms to MEDLINE abstracts. Every year about 100,000 documents are added to MEDLINE, and index terms are assigned to each document by hand from a controlled vocabulary called MeSH. This manual process is time-consuming and, because MeSH is so large, may lead to inconsistent indexing. Our purpose is to explore the feasibility of automating this indexing. To that end, we apply two document classification methods, based on SVM [1] and AdaBoost [4], both of which have shown good results in classifying news corpora, and analyze their results. We define a class as the set of abstracts that share the same MeSH term. Although MeSH terms have a hierarchical structure, each class is treated as independent. We used the MeSH terms previously assigned by specialists as the reference answers and compared them with the MeSH terms assigned by SVM and AdaBoost.
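The class definition and the comparison against specialist-assigned terms can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which evaluation measure was used, so per-term precision and recall are an assumption here, and the function names (`build_classes`, `per_term_scores`), the abstract IDs, and the toy term assignments are all hypothetical. The classifiers themselves (SVM and AdaBoost) are not shown; `predicted` stands in for their output.

```python
from collections import defaultdict


def build_classes(assignments):
    """Map each MeSH term to the set of abstract IDs that carry it.

    Each term forms its own independent class; the MeSH hierarchy is ignored.
    """
    classes = defaultdict(set)
    for doc_id, terms in assignments.items():
        for term in terms:
            classes[term].add(doc_id)
    return dict(classes)


def per_term_scores(gold, predicted):
    """Compare predicted terms with specialist-assigned terms, term by term.

    Returns {term: (precision, recall)}. The choice of metric is an assumption.
    """
    gold_classes = build_classes(gold)
    pred_classes = build_classes(predicted)
    scores = {}
    for term in sorted(set(gold_classes) | set(pred_classes)):
        g = gold_classes.get(term, set())
        p = pred_classes.get(term, set())
        true_positives = len(g & p)
        precision = true_positives / len(p) if p else 0.0
        recall = true_positives / len(g) if g else 0.0
        scores[term] = (precision, recall)
    return scores


# Toy data invented for illustration; not taken from the paper.
gold = {
    "a1": {"Humans", "Neoplasms"},
    "a2": {"Humans"},
    "a3": {"Neoplasms", "Mice"},
}
predicted = {
    "a1": {"Humans"},
    "a2": {"Humans", "Neoplasms"},
    "a3": {"Neoplasms", "Mice"},
}

scores = per_term_scores(gold, predicted)
for term, (precision, recall) in scores.items():
    print(f"{term}: precision={precision:.2f} recall={recall:.2f}")
```

Because every term is scored separately, an error on one term (here "Neoplasms") does not affect the scores of the others, which mirrors the independence assumption stated in the abstract.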
Related Articles
Classification of Clinically Useful Sentences in MEDLINE
OBJECTIVE In a previous study, we investigated a sentence classification model that uses semantic features to extract clinically useful sentences from UpToDate, a synthesized clinical evidence resource. In the present study, we assess the generalizability of the sentence classifier to Medline abstracts. METHODS We applied the classification model to an independent gold standard of high qualit...
Information Extraction and Sentence Classification applied to Clinical Trial MEDLINE Abstracts
In this paper, we first report experimental results on applying information extraction (IE) methodology to the task of summarizing clinical trial design information, focusing on “Compared Treatment”, “Endpoint” and “Patient Population”, from clinical trial MEDLINE abstracts. From these results, we have come to see this problem as one that can be decomposed into a sentence classification subtask...
Developing an Ontology for Encoding Disease Treatment Information in Medical Abstracts
A disease-treatment ontology is being developed to model and represent treatment information found in medical abstracts. Treatment information extracted from medical abstracts and medical articles can then be encoded in this ontology and used for information retrieval, question-answering, summarisation and knowledge discovery. This paper explains the initial version of the ontology developed ba...
Automatic Classification of PubMed Abstracts with Latent Semantic Indexing: Working Notes
The 2014 BioASQ challenge 2a tasks participants with assigning semantic tags to biomedical journal abstracts. We present a system that uses Latent Semantic Analysis to identify semantically similar documents in MEDLINE to an unlabeled abstract, and then uses a novel ranking scheme to select a list of MeSH headers from candidates drawn from the most similar documents. Our approach achieved good ...
Structured abstracts in MEDLINE, 1989-1991.
OBJECTIVE To characterize the structured abstracts in biomedical journals indexed in MEDLINE over a three-year period as an initial step in exploring their utility in enhancing bibliographic retrieval. DESIGN The study examined the occurrence of structured abstracts in MEDLINE from March 1989 to December 1991, characteristics of MEDLINE records for articles with structured abstracts, editoria...